Using Provenance to Extract Semantic File Attributes

Authors

  • Daniel W. Margo
  • Robin Smogor
Abstract

Rich, semantically descriptive file attributes are valuable in many contexts, such as semantic namespaces and desktop search. Descriptive attributes help users find files placed in seemingly arbitrary locations by different applications. However, extracting semantic attributes from file contents is nontrivial. An alternative is to examine file provenance: how and when files are used, and the agents that use them. We study the extraction of semantic attributes from file provenance by applying data mining and machine learning techniques to file metadata. We show that provenance and other metadata predict semantic attributes such as file extensions. This complements previous work, which has shown that file extensions predict access patterns.
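
To make the idea of predicting a file extension from provenance concrete, the following is a minimal sketch. It is not the authors' method or data: the classifier (a small categorical naive Bayes with add-one smoothing), the feature names `writer` and `dir`, and the toy records are all assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy records (provenance-style features, file extension).
# These are illustrative assumptions, not data from the paper.
TRAIN = [
    ({"writer": "gcc", "dir": "build"}, ".o"),
    ({"writer": "gcc", "dir": "build"}, ".o"),
    ({"writer": "vim", "dir": "src"}, ".c"),
    ({"writer": "vim", "dir": "src"}, ".c"),
    ({"writer": "latex", "dir": "doc"}, ".pdf"),
    ({"writer": "vim", "dir": "doc"}, ".tex"),
]


def train(records):
    """Count labels and per-label feature values for a naive Bayes model."""
    label_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)  # (label, feature) -> value counts
    vocab = defaultdict(set)             # feature -> set of observed values
    for feats, label in records:
        for name, value in feats.items():
            value_counts[(label, name)][value] += 1
            vocab[name].add(value)
    return label_counts, value_counts, vocab


def predict(model, feats):
    """Return the extension with the highest smoothed log-probability."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior of the extension
        for name, value in feats.items():
            count = value_counts[(label, name)][value]
            # Add-one smoothing, with one extra slot for unseen values.
            score += math.log((count + 1) / (n + len(vocab[name]) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, {"writer": "gcc", "dir": "build"}))
```

In this sketch, a file written by the hypothetical `gcc` process into a `build` directory is assigned the extension most often seen with those feature values in the toy training set. A real study would use recorded provenance and a held-out evaluation rather than hand-written records.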

Similar resources

BinComp: A stratified approach to compiler provenance Attribution

Compiler provenance encompasses numerous pieces of information, such as the compiler family, compiler version, optimization level, and compiler-related functions. The extraction of such information is imperative for various binary analysis applications, such as function fingerprinting, clone detection, and authorship attribution. It is thus important to develop an efficient and automated approa...

SAF: A Provenance-Tracking Framework for Interoperable Semantic Applications

This paper describes the foundations of a framework for constructing interoperable semantic applications that support recording of provenance information. The framework uses a client-server infrastructure to control the encoding of application. Provenance records for application components, settings, and data sources are stored as part of the final application file using the Open Provenance Mod...

Principles of Provenance Theme Proposal

Theme topic and brief description: Recent research in a variety of settings (databases and data warehouses [10, 7, 20, 8], file systems [17], geographic information systems [6], scientific workflows and grid computation [19, 13], archiving and digital curation [1], and the Semantic Web [14]) has considered the problem of keeping track of metadata about creation and modification history, influenc...

A procedure for Web Service Selection Using WS-Policy Semantic Matching

In general, policy-based approaches play an important role in the management of web services, for instance in the choice of semantic web services and, in particular, quality of service (QoS). The present research work illustrates a procedure for selecting among functionally similar web services based on WS-Policy semantic matching. In this study, the procedure of WS-Policy publi...

Facilitating Trust on Data through Provenance

Research on trusted computing focuses mainly on the security and integrity of the execution environment, from hardware components to software services. However, this is only one facet of the computation, the other being the data. If our goal is to produce trusted results, a trustworthy execution environment is not enough: we also need trustworthy data. Provenance of data plays a pivotal role in...

Publication date: 2010